Abstract - A variety of results are known for the information capacity of the Poisson channel [...] results are shown to carry over in some form to the case of mean-square-constrained encoding intensity $E[\chi_t^2] \le P^2$. "On-off keyed" encoder intensity is considered. All results are given for general finite channel base measure.
Introduction


The Poisson channel model, sometimes called the Poisson-type point process channel or the direct detection photon channel, models optical communications systems as described in [2], [6], [9], [15]. The Poisson channel is a continuous-time additive noise channel with output $Y_t = X_t + N_t$, where $N = \{N_t\}_{0 \le t \le T}$ is the channel noise and $X = \{X_t\}_{0 \le t \le T}$ is the transmitted signal into which is encoded the message $\theta = \{\theta_t\}_{0 \le t \le T}$. All processes in the channel model are defined on a common probability space $(\Omega, \mathcal{F}, P)$. We write $\mathcal{F}^\theta$ for the natural history of $\theta$, $\mathcal{F}^Y$ for the natural history of $Y$, etc. By history we mean a nondecreasing family of $\sigma$-algebras. The natural history of a process $Z$ is $\mathcal{F}^Z$, where $\mathcal{F}_t^Z = \sigma[Z_s,\ 0 \le s \le t]$.
Figure 1. Poisson channel model.


In the Poisson channel both $X$ and $N$ are Poisson-type point processes [11]. Thus, $X$ and $N$ have respective compensating measures

$$A(E) = \int_E \chi_t\, b(dt), \qquad B(E) = \int_E \lambda(t)\, b(dt) \qquad (1)$$

for all $E \in \mathcal{B}[0,T]$. $b$ is called the channel base measure and is assumed to be finite: $b_T < \infty$, where $b_T = b([0,T])/T$. The encoding intensity $\chi_t$ is required to be $\mathcal{F}^\theta \vee \mathcal{F}^Y$-predictable; this allows nonanticipative message encoding and causal, noiseless, instantaneous feedback from the channel output. The noise intensity $\lambda(t)$ is assumed to be nonrandom. Hence the channel noise $N_t$ is a nonhomogeneous Poisson process. The channel output $Y_t = X_t + N_t$ is the sum of two Poisson-type point processes. Thus it is also a Poisson-type point process, with intensity $\mu_t = \chi_t + \lambda(t)$, where $\mu_t$ is predictable with respect to the global history $\mathcal{F}^\theta \vee \mathcal{F}^Y$. Poisson-type point process intensities are, by definition, nonnegative. Within the context of optical communications [2], [15], $\lambda(t)$ (see Figure 1) represents, nominally, noise due to background radiation as seen by the receiver. We make the usual assumption that the message and noise are independent, i.e., that the histories $\mathcal{F}^\theta$ and $\mathcal{F}^N$ are independent.

Existence and uniqueness of the compensating measures $A$ and $B$ in the Poisson channel model can be established by means of the Doob-Meyer submartingale decomposition [2], [11], [14]. $X$ and $N$ are each submartingales; therefore each of them has associated with it a unique predictable increasing process, $A_t$ and $B_t$, respectively, such that

$$X_t - A_t, \qquad N_t - B_t$$

are each martingales. Making the identifications $A([0,t]) = A_t$ and $B([0,t]) = B_t$, the compensating measures are also seen to exist uniquely. Alternatively, uniqueness and existence of the measures $A$ and $B$ can be established using projection methods [5].


A notable feature of the Poisson channel model is the presence of two different sources of noise. Besides the channel noise represented by $N_t$, there is an encoding noise inherent to the channel. Encoding noise arises because the message $\theta$ is encoded indirectly into $X$ via the intensity $\chi = \chi(\theta, Y)$. The path of $X$ is influenced by $\chi_t$ and, also, by the innovation martingale $m_t = X_t - A_t$ deriving from the Doob-Meyer decomposition of $X$. Thus, a trajectory of $X$ over a finite interval $[0,T]$ is insufficient to recover $\chi_t$ even when the noise intensity $\lambda(t)$ is identically zero. Hence, one speaks of both channel noise and encoder noise in the Poisson channel.


Information capacity is defined in terms of the average mutual information $I_T[\theta,Y]$ in the message and channel output processes, $\theta$ and $Y$, over the interval $[0,T]$. Let $\mu_\theta$, $\mu_Y$, and $\mu_{\theta Y}$ be the marginal and joint measures induced by the message and output processes, $\theta$ and $Y$, on the spaces $S^\theta$, $S^Y$, and $S^\theta \times S^Y$, where $S^\theta$ and $S^Y$ are the spaces of trajectories of $\theta$ and $Y$ over the interval $[0,T]$. Write the induced product measure as $\mu_{\theta \times Y}$. Then, the average mutual information in $\theta$ and $Y$ over the interval $[0,T]$ is [13]

$$I_T[\theta,Y] = E\left[\ln \frac{d\mu_{\theta Y}}{d\mu_{\theta \times Y}}\right]$$

provided $\mu_{\theta Y} \ll \mu_{\theta \times Y}$; otherwise $I_T[\theta,Y] = \infty$. Expressions exist for the average mutual information over the interval $[0,T]$ in the Poisson channel with base measure $b$ and channel output intensity $\mu_t$. Define

$$I_1 = E\left[\int_0^T \left(\mu_t \ln \mu_t - \mu_t \ln \hat\mu_t\right) b(dt)\right] \qquad (2)$$

where $\hat\mu_t = E[\mu_t \mid \mathcal{F}_t^Y]$. Note that, in the terminology of Boel, Varaiya, and Wong [1], $\hat\mu_t$ is the intrinsic local description of the channel output process $Y$, whereas $\mu_t$ is an extrinsic local description (with respect to the history $\mathcal{F}^\theta \vee \mathcal{F}^Y$). According to Liptser and Shiryayev [11], $I_T[\theta,Y] = I_1$ provided $I_1 < \infty$ (and, as a consequence, $\mu_{\theta Y} \ll \mu_{\theta \times Y}$ for $I_1 < \infty$). A useful equivalent expression for the channel information is

$$I_2 = E\left[\int_0^T \left(\mu_t \ln \mu_t - \hat\mu_t \ln \hat\mu_t\right) b(dt)\right] \qquad (3)$$

We have $I_2 = I_1$, so that $I_T[\theta,Y] = I_2$ if $I_2 < \infty$. The channel information capacity is

$$C = \sup_{\theta} \sup_{\chi} \frac{1}{T} I_T[\theta,Y]$$

where $\theta$ is any jointly measurable process defined over the interval $[0,T]$ and $\chi = \chi(\theta,Y)$ is any $\mathcal{F}^\theta \vee \mathcal{F}^Y$-predictable mapping.

The information capacity of the Poisson channel was first found by Kabanov [9] for the case of a peak-constrained encoder intensity $0 \le \chi_t \le c$ and constant noise intensity $\lambda(t) = \lambda$. Considering only Lebesgue channel base measure, he showed that

$$C = C(\lambda, c) \qquad (4)$$

where

$$C(x,y) = x\left[\frac{1}{e}\left(1 + \frac{y}{x}\right)^{1 + x/y} - \left(1 + \frac{x}{y}\right)\ln\left(1 + \frac{y}{x}\right)\right]. \qquad (5)$$
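As a numerical illustration of (5), the following minimal sketch evaluates Kabanov's capacity function. It assumes the standard closed form $C(\lambda,c) = \lambda[(1/e)(1+c/\lambda)^{1+\lambda/c} - (1+\lambda/c)\ln(1+c/\lambda)]$; the function name `kabanov_capacity` is hypothetical and chosen only for this example.

```python
import math


def kabanov_capacity(lam: float, c: float) -> float:
    """Assumed closed form (5): C(lam, c) for noise intensity lam >= 0 and peak level c > 0."""
    if c <= 0.0:
        raise ValueError("peak constraint c must be positive")
    if lam == 0.0:
        return c / math.e  # limiting value of the closed form as lam -> 0
    r = c / lam
    exponent = (1.0 + 1.0 / r) * math.log1p(r)  # (1 + lam/c) * ln(1 + c/lam)
    power_term = math.exp(exponent - 1.0)       # (1/e) * (1 + c/lam)^(1 + lam/c)
    return lam * (power_term - exponent)
```

Two properties are easy to check with this sketch: the homogeneity $C(\lambda z, cz) = zC(\lambda,c)$ and the zero-noise limit $C(0,c) = c/e$.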


The approach taken by Kabanov was adapted by Frey [6] to treat the case of time-varying noise intensity $\lambda(t)$ and time-varying peak constraint $0 \le \chi_t \le c(t)$. For time-varying channel parameters $\lambda(t)$ and $c(t)$ and finite channel base measure $b$, we have

$$C = \frac{1}{T}\int_0^T C(\lambda(t), c(t))\, b(dt).$$

In other work along these lines, Davis [4] treated the case $\lambda(t) = \lambda$, $c(t) = c$, and Lebesgue $b$ in which an additional average constraint

$$\frac{1}{T}\int_0^T E[\chi_t]\, dt \le k$$

is imposed on the encoder intensity and showed how the capacity is modified. Recently, Wyner [16] showed that the coding capacity equals the information capacity for the case considered by Davis. Wyner also found an analytic expression for the channel error exponent in this case. An earlier contribution in this area is that of Massey [12].

In this paper we consider a mean-square constraint

$$E[\chi_t^2] \le P^2 \qquad (6)$$

($P > 0$) on the encoder intensity. This constraint is sufficient for finite information capacity. In fact, one has that


Ir<E 


=  E 


Zb([0J]) 


P2+ 


1 


<  oo 


which, as already observed, is sufficient for $I_T[\theta,Y]$ to be expressible by either $I_1$ or $I_2$.

A mean-square constraint such as (6) or the similar constraints

$$\frac{1}{T}\int_0^T E[\chi_t^2]\, b(dt) \le P^2, \qquad E[\chi_t^2] \le P^2(t)$$

has an intuitive power/energy interpretation. Such constraints also recall the mean-square constraint

$$\frac{1}{T}\int_0^T E[\phi_t^2]\, dt \le P^2$$

appearing in Kadota, Zakai, and Ziv's treatment [10] of the additive white Gaussian noise channel

$$Y_t = \int_0^t \phi_s\, ds + W_t$$

where $W_t$ is Wiener noise. These considerations, and the fact that (6) is sufficient for finite capacity, led us to the results we now present for the Poisson channel information capacity.
INFORMATION CAPACITY


We will assume without further mention that $b_T = 1$, as in the particular case of Lebesgue base measure. This entails no real loss of generality and clarifies the presentation. Four theorems are given. Theorem 1 considers the special case of the Poisson channel with zero noise intensity and states that the channel information capacity is

$$C = \frac{2P}{e}$$

for a mean-square-constrained encoder intensity as in (6). Theorem 2 takes up the case of general $\lambda(t)$ and gives an expression for the capacity for the case in which the encoder intensity is restricted to only two values, the "on-off keying" case. Theorem 3 gives upper and lower bounds on the capacity for the general case in which "on-off keying" is not necessarily stipulated. Theorem 4 also treats $\lambda(t) \ge 0$ but with a formulation of the mean-square constraint on the encoder intensity which allows one to give an exact expression for the capacity. We conclude with a conjecture and some comments regarding jamming and coding capacity.

The following lemmas set the stage for Theorem 1.

Lemma 1: Let $D$ be a Borel subset of $\mathbb{R}$ and define $\Lambda_D$ to be the class of random variables with range in $D$. Let $f$ be a real function and suppose the inverse of $f$, $f^{-1}$, exists on $D$. For $P \in D$, define $\Lambda = \{X \in \Lambda_D : E[f(X)] \le f(P)\}$. Let $g$ be a real function such that $g \circ f^{-1}$ is concave and nondecreasing. Then

$$\sup_{X \in \Lambda} E[g(X)] = g(P).$$

Proof: Define $h = g \circ f^{-1}$. Then

$$E[g(X)] = E[h(f(X))] \le h(E[f(X)]) \le h(f(P)) = g(P).$$

Let $X = P$. Then $X \in \Lambda$ and $E[g(X)] = g(P)$. The result follows.

Lemma 2: Let $\Lambda = \{X \ge 0 : E[X^2] \le P^2\}$ be the class of nonnegative random variables with constrained second moment. Then

$$\sup_{X \in \Lambda} E[X \ln X] = \gamma(P)$$

where

$$\gamma(P) = \begin{cases} P \ln P, & P \ge e \\ P^2/e, & P < e \end{cases} \qquad (7)$$

Proof: $h(x) = \gamma(\sqrt{x})$ is concave and increasing for $x \ge 0$. Since $x \ln x \le \gamma(x)$ for all $x \ge 0$, the previous lemma (applied with $f(x) = x^2$ and $g = \gamma$) gives

$$E[X \ln X] \le E[\gamma(X)] \le \gamma(P).$$

Therefore

$$\sup_{X \in \Lambda} E[X \ln X] \le \gamma(P).$$

Suppose $P \ge e$. Choose $X = P$. Then $E[X \ln X] = \gamma(P)$. Suppose $P < e$. Choose

$$X = \begin{cases} 0, & \text{w.p. } 1 - P^2/e^2 \\ e, & \text{w.p. } P^2/e^2 \end{cases}$$

Then, again, $E[X \ln X] = (P^2/e^2)\, e \ln e = \gamma(P)$. Hence, for all $P > 0$, a random variable $X \in \Lambda$ exists such that $E[X \ln X] = \gamma(P)$. (7) is proved.
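The extremal distributions in the proof of Lemma 2 can be checked numerically. The sketch below is only an illustration: the helper names (`gamma`, `mean_x_log_x`, `extremal_distribution`) are hypothetical, and random variables are represented as finite lists of values and probabilities.

```python
import math


def gamma(P: float) -> float:
    """gamma(P) from (7): P ln P for P >= e, and P^2 / e otherwise."""
    return P * math.log(P) if P >= math.e else P * P / math.e


def mean_x_log_x(values, probs) -> float:
    """E[X ln X] for a discrete X, with the convention 0 ln 0 = 0."""
    return sum(p * v * math.log(v) for v, p in zip(values, probs) if v > 0.0)


def extremal_distribution(P: float):
    """Distribution used in the proof: X = P a.s. if P >= e, else a two-point law on {0, e}."""
    if P >= math.e:
        return [P], [1.0]
    q = P * P / math.e ** 2  # probability of the value e, so that E[X^2] = P^2
    return [0.0, math.e], [1.0 - q, q]
```

For any $P > 0$, `mean_x_log_x(*extremal_distribution(P))` should agree with `gamma(P)` up to rounding, while other feasible laws give smaller values.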


Lemma 3: For the Poisson channel with finite base measure $b$ and mean-square-constrained encoding intensity as in (6),

$$C \le \gamma(P) + \frac{1}{e}.$$

When there is no noise intensity present ($\lambda(t) = 0$),

$$C \ge \frac{2P}{e}.$$

Proof: The upper bound follows from Lemma 2 and an application of Jensen's inequality,

$$I_T[\theta,Y] = E\left[\int_0^T \left(\mu_t \ln \mu_t - \hat\mu_t \ln \hat\mu_t\right) b(dt)\right] \le \int_0^T \left(E[\chi_t \ln \chi_t] - E[\chi_t] \ln E[\chi_t]\right) b(dt).$$

To establish the lower bound, one uses Bremaud's averaging principle [3] with a sequence of stationary random telegraph signal [8] message processes $\{\theta^{(m)},\ m = 1, 2, \ldots\}$ having common state space $\{0, eP\}$ and generator $mA$, where $A$ is the matrix

$$A = \begin{pmatrix} -1 & 1 \\ \dfrac{1-p}{p} & -\dfrac{1-p}{p} \end{pmatrix} \qquad (8)$$

and $p = e^{-2}$. In the channel with these message processes, and with $w(x) = x \ln x$, $E[w(\chi_t^{(m)})] = E[w(\theta_t^{(m)})] = (P/e)\ln(Pe)$ and $E[\chi_t^{(m)}] = E[\theta_t^{(m)}] = P/e$. Thus $C \ge (P/e)\ln(Pe) - (P/e)\ln(P/e) = 2P/e$.
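The numbers in the lower-bound argument can be reproduced with a short sketch. It assumes the generator in (8) has rows $(-1, 1)$ and $((1-p)/p, -(1-p)/p)$, with the first row belonging to the state 0, and it evaluates $E[X\ln X] - E[X]\ln E[X]$ for a two-point variable with $E[X^2] = P^2$. The function names are hypothetical.

```python
import math


def stationary_on_probability(p: float) -> float:
    """Stationary probability of the 'on' state for rows (-1, 1) and ((1-p)/p, -(1-p)/p)."""
    rate_off_to_on = 1.0
    rate_on_to_off = (1.0 - p) / p
    # Two-state chain: pi_on = (rate into 'on') / (sum of both rates).
    return rate_off_to_on / (rate_off_to_on + rate_on_to_off)


def zero_noise_rate(P: float, p: float) -> float:
    """E[X ln X] - E[X] ln E[X] for X = P/sqrt(p) w.p. p and 0 otherwise (E[X^2] = P^2)."""
    a = P / math.sqrt(p)  # 'on' level that meets the mean-square constraint with equality
    mean = p * a
    return p * a * math.log(a) - mean * math.log(mean)
```

With $p = e^{-2}$ the "on" level is $eP$ and the rate equals $2P/e$; other values of $p$ give a smaller rate, which is consistent with the choice made in the proof.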


Theorem 1: Consider the Poisson channel with finite base measure $b$ and zero noise intensity. Suppose the encoding intensity $\chi_t$ satisfies the mean-square constraint $E[\chi_t^2] \le P^2$. Then the channel information capacity is $C = 2P/e$.

Proof: Let $I[X] = E[X \ln X] - E[X] \ln E[X]$ and define $\mathcal{B}(P)$ for each $P > 0$ to be the class of nonnegative random variables $X$ such that $E[X^2] \le P^2$. For each $X \in \mathcal{B}(P)$, we can write $X = PZ$ for some $Z \in \mathcal{B}(1)$. For $X = PZ$, $I[X] = P\, I[Z]$. Thus

$$C = \sup_{X \in \mathcal{B}(P)} I[X] = P \sup_{Z \in \mathcal{B}(1)} I[Z].$$

We observe that the upper and lower bounds on capacity given in Lemma 3 coincide for $P = 1$. Therefore

$$\sup_{Z \in \mathcal{B}(1)} I[Z] = \frac{2}{e}.$$

Thus $C = 2P/e$ and the proof is complete.

Corollary: Consider the Poisson channel with finite base measure $b$, zero noise intensity, and encoder intensity satisfying

$$E[\chi_t^2] \le P^2(t), \qquad 0 \le t \le T$$

where $P(t)$ is $b$-integrable. Then the channel capacity is

$$C = \frac{1}{T}\int_0^T \frac{2}{e} P(t)\, b(dt).$$

Proof: If $P(t)$ is $b$-integrable then $I_1 < \infty$ and the usual expressions (2) and (3) for channel information can be used. Then the result follows from approximating $P(t)$ by a simple function $P_n(t)$ and passing to the limit as $P_n(t) \to P(t)$ pointwise.


Corollary: The results of Theorem 1 are unchanged by substituting

$$\frac{1}{T}\int_0^T E[\chi_t^2]\, b(dt) \le P^2$$

for the stronger constraint $E[\chi_t^2] \le P^2$ used in Theorem 1.

Proof: Write $E[\chi_t^2] = m(t)$. From the previous corollary

$$C = \sup_{m \in \Gamma} \frac{1}{T}\int_0^T \frac{2}{e}\sqrt{m(t)}\, b(dt) \qquad (9)$$

where $\Gamma$ is the class of nonnegative functions

$$\Gamma = \left\{ m(t) : m(t) \ge 0,\ \frac{1}{T}\int_0^T m(t)\, b(dt) \le P^2 \right\}.$$

The square root function is nonnegative, increasing, and concave so, by a "waterpouring" argument [6], $m(t)$ is optimally chosen in (9) to be the constant $m(t) = P^2$. Thus $C = 2P/e$ and the corollary is proved.
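The concavity step can be illustrated on a discretized interval. The sketch below assumes the base measure is split into cells of equal $b$-mass, so the time average becomes a plain mean; `zero_noise_capacity` and `random_profile` are hypothetical names for this example only.

```python
import math
import random


def zero_noise_capacity(profile) -> float:
    """(1/T) * integral of (2/e) sqrt(m(t)) b(dt), for m constant on cells of equal b-mass."""
    return sum((2.0 / math.e) * math.sqrt(m) for m in profile) / len(profile)


def random_profile(n_cells: int, P: float, rng: random.Random):
    """A random nonnegative profile m whose average equals P^2 (feasible for the relaxed constraint)."""
    raw = [rng.random() for _ in range(n_cells)]
    scale = P * P * n_cells / sum(raw)
    return [scale * r for r in raw]
```

By Jensen's inequality no feasible profile should exceed the value $2P/e$ attained by the constant profile $m(t) = P^2$, and random trials with this sketch behave accordingly.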


We now turn to the case of nonzero noise intensity. This case is not as tractable as the case of zero noise intensity treated in Theorem 1 and its corollaries and at present, with one exception, only bounds and asymptotic results can be given. The exception is the "on-off keying" case, in which the encoder intensity switches between only two values (neither of which is necessarily zero). For "on-off keyed" encoder intensities, we can and do (Theorem 2) give an expression for the capacity. It is clear that when the encoder intensity is restricted to only two values, one of these values should be chosen to be zero (to maximize the channel information rate). Transitions of the encoder intensity between its zero value and its second (positive) value might typically be accomplished by turning a power source on and off. Hence the nomenclature "on-off keying."


For $x > 0$, define

$$k(x) = \frac{x}{e}\left(1 + \frac{1}{x}\right)^{x+1} - x. \qquad (10)$$

Also, for channel parameters $\lambda$ and $P$, let

$$A = \{a > 0 : a^2 k(\lambda/a) \ge P^2\}.$$

The function $f(a) = a^2 k(\lambda/a)$ is increasing. Therefore $A$ is a semi-infinite interval of the form $[a_0, \infty)$ where $a_0^2 k(\lambda/a_0) = P^2$. Theorem 2 follows readily from the following lemma.
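The left endpoint $a_0$ of $A$ has no closed form in general, but since $f(a) = a^2 k(\lambda/a)$ is increasing it can be located by bisection. The following is a minimal sketch under the assumption that $k(x) = (x/e)(1+1/x)^{x+1} - x$, extended by its limit $k(0) = 1/e$; the function names are hypothetical.

```python
import math


def k(x: float) -> float:
    """k(x) = (x/e)(1 + 1/x)^(x+1) - x, with the limiting value k(0) = 1/e."""
    if x == 0.0:
        return 1.0 / math.e
    # Rewritten as x * (exp((x+1) ln(1+1/x) - 1) - 1) to limit cancellation for large x.
    return x * math.expm1((x + 1.0) * math.log1p(1.0 / x) - 1.0)


def f(a: float, lam: float) -> float:
    """f(a) = a^2 k(lam / a), the increasing function that defines the set A."""
    return a * a * k(lam / a)


def a0(lam: float, P: float, iterations: int = 200) -> float:
    """Left endpoint of A: the solution of f(a) = P^2, found by bisection (P > 0)."""
    lo, hi = 0.0, P
    while f(hi, lam) < P * P:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid, lam) < P * P:
            lo = mid
        else:
            hi = mid
    return hi
```

For $\lambda = 0$ the equation reduces to $a^2/e = P^2$, so `a0(0.0, P)` should return approximately $\sqrt{e}\,P$.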

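Lemma 4: Let $I[X] = E[(X + \lambda)\ln(X + \lambda)] - (E[X] + \lambda)\ln(E[X] + \lambda)$ and define $\mathcal{B}_P$ for each $P \ge 0$ to be the class of nonnegative random variables $X$ having two possible values and such that $E[X^2] \le P^2$. Then

$$\sup_{X \in \mathcal{B}_P} I[X] = \max_{a \in A} \frac{P^2}{a} \ln \frac{a\,k(\lambda/a) + \lambda}{(P^2/a)\,k(\lambda a/P^2) + \lambda}. \qquad (11)$$

Proof: For $P = 0$, the RHS of (11) is zero. Therefore, in this case (11) is true. Thus we only address $P > 0$. Let $\mathcal{B}_{P0}$ be the class of nonnegative random variables $X \in \mathcal{B}_P$ of the form

$$X = \begin{cases} a, & \text{w.p. } p \\ 0, & \text{w.p. } 1 - p \end{cases} \qquad (12)$$

Then, as was observed above,

$$\sup_{X \in \mathcal{B}_P} I[X] = \sup_{X \in \mathcal{B}_{P0}} I[X].$$

Hence we need to show that

$$\sup_{X \in \mathcal{B}_{P0}} I[X] = \max_{a \in A} \frac{P^2}{a} \ln \frac{a\,k(\lambda/a) + \lambda}{(P^2/a)\,k(\lambda a/P^2) + \lambda}. \qquad (13)$$

The proof of (13) is conducted in 3 steps.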

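Step 1: For $X \in \mathcal{B}_{P0}$ as in (12) with $0 < p < 1$, suppose $a < P$ and write

$$Y = \begin{cases} a + \delta, & \text{w.p. } p \\ 0, & \text{w.p. } 1 - p \end{cases}$$

$Y \in \mathcal{B}_{P0}$ for $\delta \in [0, P - a]$. Define $I_\delta = I[Y]$. We have

$$\frac{\partial I_\delta}{\partial \delta} = p \ln \frac{a + \delta + \lambda}{p(a + \delta) + \lambda}.$$

$\partial I_\delta / \partial \delta > 0$ for $\delta \in [0, P - a]$, so $I[X] = I_0 < I_{P-a}$ and

$$\sup_{X \in \mathcal{B}_{P0}} I[X] = \sup_{X \in \mathcal{B}_{PS}} I[X] \qquad (14)$$

where $\mathcal{B}_{PS}$ is the class of random variables $X \in \mathcal{B}_{P0}$ as in (12) such that $a \ge P$.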

Step 2: Fix $P > 0$. Suppose $X \in \mathcal{B}_{PS}$ with $a$ fixed ($P \le a$). Then, as a consequence of the inequality $E[X^2] \le P^2$, $p = P\{X = a\}$ is restricted to the range

$$0 \le p \le \frac{P^2}{a^2}. \qquad (15)$$

Let us identify the value of $p$ which maximizes $I[X]$. Define

$$m = \frac{\phi(a, \lambda)}{a}, \qquad b = \lambda \ln \lambda,$$

where $\phi(x, y) = (x + y)\ln(x + y) - y \ln y$. The line $mx + b$ interpolates $(x + \lambda)\ln(x + \lambda)$ at $x = 0$ and $x = a$, the only two values taken by $X$. Then, for $X \in \mathcal{B}_{PS}$,

$$I[X] = E[(X + \lambda)\ln(X + \lambda)] - (E[X] + \lambda)\ln(E[X] + \lambda)$$
$$= E[(X + \lambda)\ln(X + \lambda) - mX - b] + mE[X] + b - (E[X] + \lambda)\ln(E[X] + \lambda)$$
$$= mE[X] + b - (E[X] + \lambda)\ln(E[X] + \lambda).$$

Define $I(x) = mx + b - (x + \lambda)\ln(x + \lambda)$. For all $x \ge 0$,

$$\frac{dI}{dx} = m - 1 - \ln(x + \lambda), \qquad \frac{d^2 I}{dx^2} = -\frac{1}{x + \lambda} < 0.$$

Thus $I(x)$ has a unique maximum at $x = e^{m-1} - \lambda$. Using $k(x)$ defined in (10), $p = E[X]/a$, and the identity

$$\phi(a, \lambda) = a \ln(a\,k(\lambda/a) + \lambda) + a,$$

we have that, subject to the constraint in (15), the choice of $p$ which maximizes $I[X]$ is

$$p = k(\lambda/a) \wedge \frac{P^2}{a^2}. \qquad (16)$$

Let $\mathcal{B}_Z$ be the random variables $X \in \mathcal{B}_{PS}$ with $p$ given as in (16). Then

$$\sup_{X \in \mathcal{B}_{PS}} I[X] = \sup_{X \in \mathcal{B}_Z} I[X]. \qquad (17)$$

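Step 3: For $X \in \mathcal{B}_Z$, $I[X]$ assumes one of two forms depending on $p$ in (16). After some algebra, one has

$$I[X] = \begin{cases} a\,k(\lambda/a) \ln \dfrac{a\,k(\lambda/a) + \lambda}{a\,k(\lambda/a)\,k\bigl(\lambda/(a\,k(\lambda/a))\bigr) + \lambda}, & a^2 k(\lambda/a) < P^2 \\[2ex] \dfrac{P^2}{a} \ln \dfrac{a\,k(\lambda/a) + \lambda}{(P^2/a)\,k(\lambda a/P^2) + \lambda}, & a^2 k(\lambda/a) \ge P^2 \end{cases} \qquad (18)$$

Consider the first case in (18), in which $a^2 k(\lambda/a) < P^2$. Letting $\alpha = a/\lambda$, we have $I[X]/\lambda = G(\alpha k(1/\alpha))$ where

$$G(s) = s \ln \frac{s + 1}{s\,k(1/s) + 1}.$$

Now

$$\frac{d}{d\alpha}\, \alpha k(1/\alpha) = \frac{(1 + \alpha)^{1 + 1/\alpha}}{e \alpha^2}\,\bigl[\alpha - \ln(1 + \alpha)\bigr] > 0$$

for all $\alpha > 0$. Thus $\alpha k(1/\alpha)$ is a nondecreasing function of $\alpha$. Then, also, $G(s)$ is a nondecreasing function of $s$. Therefore, for $a^2 k(\lambda/a) \le P^2$ the maximum value of $I[X]$ is found at $a$ satisfying $a^2 k(\lambda/a) = P^2$. Hence

$$\sup_{X \in \mathcal{B}_Z} I[X] = \max_{a \in A} \frac{P^2}{a} \ln \frac{a\,k(\lambda/a) + \lambda}{(P^2/a)\,k(\lambda a/P^2) + \lambda}. \qquad (19)$$

Therefore, by (14), (17), and (19), our desired result (13) is obtained.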


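Theorem 2: For the Poisson channel with noise intensity $\lambda$, mean-square constraint parameter $P$, finite $b$, and "on-off keyed" encoder intensity, the information capacity is $C = D_0(\lambda, P)$ where

$$D_0(\lambda, P) = \max_{a \in A} \frac{P^2}{a} \ln \frac{a\,k(\lambda/a) + \lambda}{(P^2/a)\,k(\lambda a/P^2) + \lambda}.$$

Also, as $P \to \infty$,

$$D_0(\lambda, P) \sim \frac{2}{e} P \qquad (20)$$

for any fixed $\lambda$.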


Proof: By Jensen's inequality,

$$C \le \sup_{X \in \mathcal{B}_P} I[X].$$

By the usual choice of sequence of random telegraph signals and using Bremaud's averaging principle in taking the limit as $m \to \infty$, we find that

$$C \ge \sup_{X \in \mathcal{B}_P} I[X].$$

By Lemma 4, then, $C = D_0(\lambda, P)$.

To prove (20), first observe that

$$D_0(\lambda, P) \le D_0(0, P) = \max_{a \in A} \frac{2P^2}{a} \ln \frac{a}{P} = \frac{2P}{e}.$$

Thus $D_0(\lambda, P) \le 2P/e$. Consider $a = eP$. We have $a \in A$ so

$$D_0(\lambda, P) \ge \frac{P^2}{eP} \ln \frac{eP\,k(\lambda/(eP)) + \lambda}{P^2 k(\lambda eP/P^2)/(eP) + \lambda} = \lambda \left[ \frac{2}{e} Q + \frac{Q}{e} \ln \frac{k(1/(eQ)) + 1/(eQ)}{k(e/Q) + e/Q} \right]$$

where for convenience we have used $Q = P/\lambda$. Expanding $k(x)$ as in [4, Lemma 2] and using the logarithm expansion $\ln(1 + x) = x + o(x)$ for $x \to 0$, we obtain


1  +  — — 

D.a.P)  >  X-Q  +  An - 

e  e  .  23  e 


■  o  (\tQ ) 


i  +  Tfr°(i/C) 


23  J_ 
12  eQ 


23  _e_ 
12  Q 


o(UQ) 


+ 


23  .  e  -  1 
12  ^  e2 


o(l). 


(20) follows and the proof is complete.
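The on-off capacity can also be estimated directly from the definition of $I[X]$, without the closed-form maximizer. The sketch below is a coarse numerical illustration under stated assumptions: it restricts $X$ to the values $\{0, a\}$, searches $a$ on a truncated grid $(P, 20P]$, and optimizes the "on" probability by ternary search (valid because $I$ is concave in $p$). Because the grid is finite it returns a lower estimate of $D_0(\lambda, P)$. All names are hypothetical.

```python
import math


def xlogx(v: float) -> float:
    """v ln v with the convention 0 ln 0 = 0."""
    return 0.0 if v <= 0.0 else v * math.log(v)


def on_off_information(a: float, p: float, lam: float) -> float:
    """I[X] for X = a w.p. p and X = 0 w.p. 1 - p, with noise intensity lam."""
    return p * xlogx(a + lam) + (1.0 - p) * xlogx(lam) - xlogx(p * a + lam)


def best_on_probability(a: float, lam: float, p_max: float, iterations: int = 60) -> float:
    """Ternary search for the maximizing p in [0, p_max]; I is concave in p."""
    lo, hi = 0.0, p_max
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if on_off_information(a, m1, lam) < on_off_information(a, m2, lam):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


def d0_estimate(lam: float, P: float, n_grid: int = 400, a_max_factor: float = 20.0) -> float:
    """Grid-based lower estimate of D_0(lam, P) over on-off laws with E[X^2] <= P^2."""
    best = 0.0
    for i in range(1, n_grid + 1):
        a = P * (1.0 + (a_max_factor - 1.0) * i / n_grid)  # 'on' level in (P, 20 P]
        p = best_on_probability(a, lam, min(1.0, (P / a) ** 2))
        best = max(best, on_off_information(a, p, lam))
    return best
```

With $\lambda = 0$ the estimate should be close to $2P/e$, and it should decrease as $\lambda$ grows. When $\lambda$ is much larger than $P$, the truncation of the grid at $20P$ may become the limiting factor, so the cutoff is best treated as a tunable assumption.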

In terms of the dimensionless quantities

$$\alpha = \frac{a}{\lambda}, \qquad Q = \frac{P}{\lambda},$$

we have $D_0(\lambda, P) = \lambda D_0(1, Q)$ where

$$D_0(1, Q) = \sup_{\alpha^2 k(1/\alpha) \ge Q^2} \frac{Q^2}{\alpha} \ln \frac{\alpha\,k(1/\alpha) + 1}{(Q^2/\alpha)\,k(\alpha/Q^2) + 1}.$$

The shape of the curve defined by $D_0(1, Q)$ is remarkably similar to that defined by $C(1, Q)$. Both are increasing concave functions. Kabanov found in [9] that $C(1, Q) \sim Q/e$. Thus the two curves each have linear asymptotes. These comments are illustrated in Figure 2.
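It will be useful to have a notation for the information capacity of the Poisson channel with mean-square constraint. Define $D(\lambda, P)$ to be the capacity of the channel with constant noise intensity $\lambda$ and mean-square-constrained encoder intensity $E[\chi_t^2] \le P^2$. In this notation Theorem 1 states that $D(0, P) = 2P/e$. Analogous to the fact that $C(\lambda z, cz) = zC(\lambda, c)$ for Kabanov's capacity function stated in (5), we have

$$D(\lambda z, Pz) = zD(\lambda, P). \qquad (21)$$

To see this we use $w(x) = x \ln x$, write $\hat\chi_t = E[\chi_t \mid \mathcal{F}_t^Y]$, and write

$$D(\lambda z, Pz) = \sup_{\theta} \sup_{\chi:\, E[\chi_t^2] \le P^2 z^2} \frac{1}{T} E\left[\int_0^T \bigl( w(\chi_t + z\lambda) - w(\hat\chi_t + z\lambda) \bigr) b(dt)\right]$$

$$= \sup_{\theta} \sup_{\chi:\, E[\chi_t^2] \le P^2 z^2} \frac{z}{T} E\left[\int_0^T \Bigl\{ \bigl( w(\chi_t/z + \lambda) - w(\hat\chi_t/z + \lambda) \bigr) + \bigl( (\chi_t/z + \lambda)\ln z - (\hat\chi_t/z + \lambda)\ln z \bigr) \Bigr\} b(dt)\right].$$

The second term within the braces contributes zero (its expectation vanishes since $E[\hat\chi_t] = E[\chi_t]$), so (21) follows.

From (21) we have $D(\lambda, P) = \lambda D(1, P/\lambda)$ for $\lambda > 0$. Thus, determining $D(\lambda, P)$ is reduced to the problem of finding the one-parameter function $D(1, \cdot)$.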
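The homogeneity behind (21) already holds at the level of a single random variable: for $I_\lambda[X] = E[w(X + \lambda)] - w(E[X] + \lambda)$ with $w(x) = x\ln x$, one has $I_{z\lambda}[zX] = z\,I_\lambda[X]$. A minimal sketch for discrete $X$ follows; `info_functional` is a hypothetical name used only here.

```python
import math


def info_functional(values, probs, lam: float) -> float:
    """E[w(X + lam)] - w(E[X] + lam) for a discrete X, where w(x) = x ln x and 0 ln 0 = 0."""
    def w(v: float) -> float:
        return 0.0 if v <= 0.0 else v * math.log(v)

    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * w(v + lam) for v, p in zip(values, probs)) - w(mean + lam)
```

Scaling both the values of $X$ and the noise level by $z$ scales the functional by $z$, because the extra $\ln z$ terms cancel between the two parts of the difference.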

Theorem 3: For the Poisson channel with noise intensity $\lambda$, mean-square constraint parameter $P$, and finite $b$, the information capacity $C = D(\lambda, P)$ admits the bounds

$$D_0(\lambda, P) \le D(\lambda, P) \le \frac{2}{e} P. \qquad (22)$$

Also, as $P \to \infty$,

$$D(\lambda, P) \sim \frac{2}{e} P \qquad (23)$$

for any fixed $\lambda$.

Proof: The first inequality in (22) follows from Theorem 2 and the second from Theorem 1. (23) follows directly from (22) and from (20) in Theorem 2.

The next theorem is related to and motivated by results obtained by Davis for polarization modulation and by Frey [6] for peak-constrained encoder intensity. Davis [4] showed that, when operating two orthogonally polarized, separately modulated Poisson channels, channel capacity is maximized when encoder intensity is not distributed over both channels but, instead, is concentrated solely in one channel. This was because of the convexity of the channel capacity function $C(x, y)$ in $y$. Frey [6] showed that for this same reason it is also better not to distribute encoder intensity over time but, rather, to concentrate it into as short a time interval as possible. This result, obtained for peak-constrained encoder intensity, is equally valid for mean-square-constrained encoder intensity. Thus consider the Poisson channel with continuous finite base measure $b$, nonrandom noise intensity $\lambda(t)$, and encoding intensity $\chi_t$. Suppose the encoder intensity is mean-square-constrained, $0 \le E[\chi_t^2] \le P^2(t)$, but that $P(t)$ is not some given function. Suppose, instead, that $P(t)$ may be chosen freely subject only to the constraint

$$\frac{1}{T}\int_0^T P(t)\, b(dt) \le Q \qquad (24)$$

for some given $Q > 0$. Then, in Theorem 4, the channel capacity is found to be

$$C = \frac{2Q}{e}. \qquad (25)$$
Notice that $\lambda(t)$ is missing from this expression. In the proof of (25) the power $P(t)$ available to the encoder is concentrated into as short a time interval as possible to obtain a rate of average mutual information in the channel closer and closer to $2Q/e$. Concentration of the encoder intensity into a short time interval permits it to be very large without violating the constraint (24) on $P(t)$. By concentrating the encoder intensity into a short time interval, it can be made so large that it completely overshadows whatever noise intensity is present in the interval in which the encoding intensity is applied. In passing to the limit, the magnitude of the noise intensity becomes irrelevant.

Theorem 4: Suppose the noise and encoded message processes, $N$ and $X$, in a Poisson channel have intensities $\lambda(t)$ and $\chi_t$ with respect to a finite continuous base measure $b$. Also, suppose the encoder intensity $\chi_t$ is adapted and mean-square-constrained, $0 \le E[\chi_t^2] \le P^2(t)$, and allow $P(t)$ to be chosen freely provided only that $P(t) \in \Gamma$, where $\Gamma$ is the class of nonnegative functions

$$\Gamma = \left\{ P(t) \ge 0 : \frac{1}{T}\int_0^T P(t)\, b(dt) \le Q \right\}.$$

Then the channel capacity is $C = 2Q/e$.

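Proof: By Corollary 1 of Theorem 1,

$$C = \sup_{P \in \Gamma} \frac{1}{T}\int_0^T D(\lambda(t), P(t))\, b(dt).$$

Define

$$\Gamma_1 = \left\{ P(t) \ge 0 : \frac{1}{T}\int_0^T P(t)\, b(dt) = Q \right\}.$$

$D(x, y)$ is nondecreasing in $y$ so

$$C = \sup_{P \in \Gamma_1} \frac{1}{T}\int_0^T D(\lambda(t), P(t))\, b(dt).$$

$D(x, y)$ is nonincreasing in $x$ and $D(0, y) = 2y/e$ so

$$C \le \sup_{P \in \Gamma_1} \frac{1}{T}\int_0^T D(0, P(t))\, b(dt) = \sup_{P \in \Gamma_1} \frac{1}{T}\int_0^T \frac{2P(t)}{e}\, b(dt) = \frac{2}{e} Q.$$

Next we show $C \ge 2Q/e$ to complete the proof. Let $G = \{t \in [0,T] : \lambda(t) \le L\}$ and choose $L$ so that $b(G) > 0$. Define $\lambda_L(t) = L$ on $G$ and $\lambda_L(t) = \infty$ elsewhere on $[0,T]$. Then $D(\lambda(t), P(t)) \ge D(\lambda_L(t), P(t))$. Therefore

$$C \ge \sup_{P \in \Gamma_1} \frac{1}{T}\int_0^T D(\lambda_L(t), P(t))\, b(dt) = \sup_{P \in \Gamma_1} \frac{1}{T}\int_G D(L, P(t))\, b(dt).$$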


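Let $S$ be the set of all nonnegative $b$-measurable simple functions on $[0,T]$. Then

$$\sup_{P \in \Gamma_1} \frac{1}{T}\int_G D(L, P(t))\, b(dt) = \sup_{P \in \Gamma_1 \cap S} \frac{1}{T}\int_G D(L, P(t))\, b(dt) = \sup_{M \ge Q_1}\ \sup_{P \in \Gamma_1 \cap S_1,\ P \le M} \frac{1}{T}\int_G D(L, P(t))\, b(dt)$$

where $Q_1 = QT/b(G)$ and $S_1 \subset S$ is the set of all functions in $S$ which vanish outside $G$. For each $M \ge Q_1$, let $P_M(t) = M\,1_A(t)$ where $A \subset G$ with $b(A) = QT/M$. $P_M \in \Gamma_1 \cap S_1$ so

$$C \ge \sup_{M \ge Q_1} \frac{1}{T}\int_G D(L, P_M(t))\, b(dt) = \sup_{M \ge Q_1} \frac{Q}{M} D(L, M).$$

We have

$$\sup_{M \ge Q_1} \frac{Q}{M} D(L, M) \ge \lim_{M \to \infty} \frac{Q}{M} D(L, M) = \frac{2Q}{e}$$

so the proof is complete.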
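The concentration argument can be illustrated numerically. The sketch below does not compute $D(L, M)$ itself; as an assumption it uses one particular on-off law, $X = eM$ with probability $e^{-2}$ and $X = 0$ otherwise (so that $E[X^2] = M^2$), whose information is a lower bound on $D(L, M)$. The function names are hypothetical.

```python
import math


def xlogx(v: float) -> float:
    """v ln v with the convention 0 ln 0 = 0."""
    return 0.0 if v <= 0.0 else v * math.log(v)


def concentrated_rate(Q: float, L: float, M: float) -> float:
    """(Q/M) * I[X] for X = e*M w.p. e^-2 (else 0) under noise L: a lower bound on (Q/M) D(L, M)."""
    a = math.e * M
    p = math.exp(-2.0)
    info = p * xlogx(a + L) + (1.0 - p) * xlogx(L) - xlogx(p * a + L)
    return (Q / M) * info
```

As $M$ grows with $Q$ and $L$ fixed, the returned rate should increase toward $2Q/e$, which mirrors the statement that the noise level becomes irrelevant in the limit.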

Final comments: Much more could be said about the Poisson channel with the mean-square constraint considered here. Following Wyner [16], one could derive the channel coding capacity by considering the Poisson channel as a discrete binary Z-channel [7] and taking the appropriate limits. One would find that the coding capacity and the information capacity were equal. Also following Wyner, one might try to calculate the channel error exponent. Whether or not an analytic expression exists for the error exponent in the present case and what its form might be are unknown. Even to obtain only the cut-off rate [4], [12], [16] would be interesting. All these various results could be extended to the case of time-varying channel parameters $\lambda(t)$ and $P(t)$. Then, as in [6] for the peak-constrained Poisson channel with noise intensity $\lambda(t)$, the optimal jamming solution could be pursued. For the mean-square-constrained "on-off keyed" Poisson channel, the information capacity $D_0(\lambda, P)$ is nonnegative, decreasing, and convex in $\lambda$. Therefore, the optimal jamming intensity will be nonrandom and "waterfilling" [6]. If $D(\lambda, P)$ proves to be convex in $\lambda$ then in this case too, one would obtain a "waterfilling" solution to the jamming problem.

In the peak-constrained Poisson channel the capacity with and without an "on-off keying" restriction on the encoder intensity is the same. Based on this, some computer calculations, and the fact that

$$D(0, P) = D_0(0, P) = \frac{2}{e} P,$$

we conjecture that the same holds true for the Poisson channel with mean-square constraint; i.e., $D(\lambda, P) = D_0(\lambda, P)$. Note that this amounts to showing that $D(1, P) = D_0(1, P)$ for all $P > 0$.

Finally, it is worth noting that the encoder intensity $\chi_t$ is $\mathcal{F}^\theta \vee \mathcal{F}^Y$-predictable in the Poisson channel model considered here. Thus our capacity results are all results for the Poisson channel with causal feedback. However, the possible presence of channel feedback was exploited in none of our proofs; in fact, implicitly or explicitly only the trivial encoding $\chi_t(\theta) = \theta_t$ is used in the proofs. Thus the capacity results presented in this paper are equally valid for the no-feedback Poisson channel with $\mathcal{F}^\theta$-predictable encoder intensity.